DETAILED ACTION
Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the fee set 
forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this
application is eligible for continued examination under 37 CFR 1.114, and the fee set 
forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action 
has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 9/18/07 
has been entered. 

Response to Arguments 

2. Applicant's arguments with respect to claims 1, 9, 13, 15, 18 and 21 have been 
fully read and considered but are moot in view of the new ground(s) of rejection. 

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

4. Claims 1-5, 7-16, 18-22 and 27-35 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Lee (5,748,789) in view of Linzer (6,094,457). 

Regarding claim 1, Lee discloses a method of encoding video content, the 
method comprising: 

assigning a predefined model to each of at least two video content portions of the 
video content (col.42, ln.47-61; note each video object has an arbitrary shape, and that



Application/Control Number: 09/874,872 Page 3 

Art Unit: 2621 

each video object is predefined according to its shape, thus, each video object or video 
portion is assigned a predefined encoder model by a mask of alpha values or a binary 
mask; in fig.27A, note there are at least two video portions, elements 972, 974, 976, 
978, 980 and 982, where there are triangular portions that consist of each of elements 
972, 974, 976, 978, 980 and 982 to form a model of a person 970; fig. 35, note frame 
1538 consists of multiple portions 1540, 1542, 1544a and 1544b); and 

routing each of the at least two video content portions to one of a plurality of 
encoders based on a respective one of the predefined models assigned to each of the 
at least two video content portions (col.42, ln.47-61, Lee discloses that each video 
object has an arbitrary shape, and that each video object is predefined according to its 
shape, thus, each video object or video portion is routed or assigned a predefined 
encoder model by a mask of alpha values or a binary mask), 

wherein the assigning a predefined model to each of the at least two video 
content portions (col.42, ln.47-61; note each video object has an arbitrary shape, and 
that each video object is predefined according to its shape, thus, each video object or 
video portion is assigned a predefined encoder model by a mask of alpha values or a 
binary mask; in fig.27A, note there are at least two video portions, elements 972, 974, 
976, 978, 980 and 982, where there are triangular portions that consist of each of 
elements 972, 974, 976, 978, 980 and 982 to form a model of a person 970; fig. 35, note 
frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b) further 
comprises: 

comparing descriptors associated with each of the at least two video content
portions with corresponding stored model descriptors from a plurality of predefined 
content models (col. 51, ln.4-59; note there are plural flags that can aid the 
determination of the video portions of the video content; col. 50, ln.18-41, Lee discloses
the comparison of the frames, in particular, the comparison is done with the shape of 
the first frame that contains its respective video portions and the shape of the second 
frame that contains its respective video portions), and 

assigning each of the at least two video content portions to a respective best 
content model from the plurality of predefined content models based on the comparing 
of the descriptors (col. 50, ln.27-37, the error computed from the inter-frame shape
coding is then applied to assign the best content model based on the interframe
comparison of the shapes between the first and second frame data). 

Lee does not specifically disclose wherein each of the at least two video portions 
comprise a temporal, multiframe segment of the video content. However, Linzer 
teaches that each of the at least two video portions comprise a temporal, multiframe 
segment of the video content (fig. 3, Linzer discloses that elements 32-1 to 32-n are the 
plural MPEG encoders, as disclosed in col. 6, ln.56-58, in that each of the encoders 32-1
to 32-n compress a temporal, multiframe segment of the available video content known 
in MPEG as a group of frames (GOPs) or a group of frames organized in a temporal, 
multiframe or grouped frames that can be partitioned into multiple frame segments, and 
that the multiple frame segments are compressed by encoders 32-1 to 32-n). 
Therefore, it would have been obvious to one of ordinary skill in the art to combine the 
teachings of Linzer into the system of Lee for permitting accurately, efficiently encoding
multiple video streaming image data while maintaining high image quality (Linzer col.4, 
ln.39-42). 
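
For illustration only, the three steps recited in claim 1 (comparing descriptors with stored model descriptors, assigning a best content model, and routing to an encoder) can be sketched as follows. This is a minimal sketch and is not taken from Lee, Linzer, or the application: the numeric descriptor vectors, the Euclidean distance measure, the model names, and the functions `assign_best_model` and `route_portions` are all hypothetical assumptions.

```python
from math import dist

# Hypothetical stored model descriptors: model name -> feature vector.
STORED_MODELS = {
    "talking_head": (0.9, 0.1, 0.2),
    "sports": (0.2, 0.9, 0.7),
    "landscape": (0.1, 0.2, 0.9),
}

# Hypothetical table tying one encoder to each predefined model.
ENCODERS = {name: f"encoder_for_{name}" for name in STORED_MODELS}


def assign_best_model(descriptor, models=STORED_MODELS):
    """Return the name of the stored model whose descriptor is nearest.

    Euclidean distance is an assumption chosen for the sketch.
    """
    return min(models, key=lambda name: dist(descriptor, models[name]))


def route_portions(portions, encoders=ENCODERS):
    """Map each portion id to the encoder tied to its best model."""
    routed = {}
    for portion_id, descriptor in portions.items():
        model = assign_best_model(descriptor)
        routed[portion_id] = encoders[model]
    return routed


# Two hypothetical video content portions, each with one descriptor.
portions = {
    "segment_a": (0.8, 0.2, 0.1),
    "segment_b": (0.1, 0.3, 0.8),
}
routing = route_portions(portions)
```

In this sketch the routing decision depends only on the model assignment, which mirrors the claim language that routing is "based on a respective one of the predefined models."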

Regarding claim 2, Lee discloses the at least two video content portions are 
video segments (col.42, ln.47-61; note each video object has an arbitrary shape, and 
that each video object is predefined according to its shape, thus, each video object or 
video portion is assigned a predefined encoder model by a mask of alpha values or a 
binary mask; in fig.27A, note there are at least two video portions, elements 972, 974, 
976, 978, 980 and 982, where there are triangular portions that consist of each of 
elements 972, 974, 976, 978, 980 and 982 to form a model of a person 970; fig. 35, note 
frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b). 

Regarding claim 3, Lee discloses the at least two video content portions are 
video subsegments (col.42, ln.47-61; note each video object has an arbitrary shape, 
and that each video object is predefined according to its shape, thus, each video object 
or video portion is assigned a predefined encoder model by a mask of alpha values or a 
binary mask; in fig.27A, note there are at least two video portions, elements 972, 974, 
976, 978, 980 and 982, where there are triangular portions that consist of each of 
elements 972, 974, 976, 978, 980 and 982 to form a model of a person 970; fig. 35, note 
frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b). 

Regarding claim 4, Lee discloses the at least two video content portions are 
video regions of interest (fig. 33, element 1502, col.42, ln.34-46, and fig. 35, note video
object information is extracted and segmented from the input video sequence, and 
segments and subsegments of the regions of interest are identified, and in fig. 35
discloses extracting multiple video objects 1540, 1542 and 1544b; fig.27A, note there 
are at least two video portions, elements 972, 974, 976, 978, 980 and 982; fig. 35, note 
frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b). 

Regarding claim 5, Lee discloses a generic encoder model (fig. 33 and col.42,
ln.62-65; note object coders 1504-1508 encode video portions associated with the
generic model; and fig. 36, note the coder shown is used to encode the video portions). 

Regarding claim 7, Lee discloses one of the plurality of predefined content 
models includes a generic video content model (fig. 33 and col.42, ln.62-65; note object
coders 1504-1508 encode video portions associated with the generic model; and fig. 36, 
note the coder shown is used to encode the video portions). 

Regarding claim 8, Lee discloses wherein assigning a predefined model to each 
of at least two video content portions of the video content further comprises assigning 
the generic video content model to a video content portion of the at least two video 
content portions if none of the other models from the plurality of predefined content 
models is assigned to the video content portion (col.42, ln.47-61; note each video object 
has an arbitrary shape, and that each video object is predefined according to its shape, 
thus, each video object or video portion is assigned a predefined encoder model by a 
mask of alpha values or a binary mask; in fig.27A, note there are at least two video 
portions, elements 972, 974, 976, 978, 980 and 982, where there are triangular portions 
that consist of each of elements 972, 974, 976, 978, 980 and 982 to form a model of a 
person 970; fig. 35, note frame 1538 consists of multiple portions 1540, 1542, 1544a and
1544b).

Regarding claim 9, Lee discloses a method of encoding video content, the 
method comprising: 

identifying video subsegments and regions of interest within at least two video 
portions from the video content (fig. 33, element 1502, col.42, ln.34-46, and fig. 35, note
video object information is extracted and segmented from the input video sequence, 
and segments and subsegments of the regions of interest are identified, and in fig. 35 
discloses extracting multiple video objects 1540, 1542 and 1544b; fig.27A, note there 
are at least two video portions, elements 972, 974, 976, 978, 980 and 982; fig. 35, note 
frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b); 

assigning a predefined encoder model to each of the at least two video portions
according to a characteristic of each of the at least two video portions, the predefined 
encoder model being chosen from a plurality of predefined models or a generic model 
(col.42, ln.47-61; note each video object has an arbitrary shape, and that each video 
object is predefined according to its shape, thus, each video object or video portion is 
assigned a predefined encoder model by a mask of alpha values or a binary mask; in 
fig.27A, note there are at least two video portions, elements 972, 974, 976, 978, 980 
and 982, where there are triangular portions that consist of each of elements 972, 974, 
976, 978, 980 and 982 to form a model of a person 970; fig. 35, note frame 1538 
consists of multiple portions 1540, 1542, 1544a and 1544b); 

encoding each of the at least two video portions associated with the generic
encoder model with a generic encoder (fig. 33 and col.42, ln.62-65; note object coders
1504-1508 encode video portions associated with the generic model; and fig. 36, note 
the coder shown is used to encode the video portions); and 

encoding each of the at least two video portions associated with the plurality of 
predefined encoder models with an encoder chosen from a plurality of encoders, each 
of the plurality of encoders being associated with one of the plurality of predefined 
models (fig. 33 and col.43, ln.10-15; note the multiplexer 1510 is used to multiplex and
encode video portions from plural video object encoders 1504-1508; and fig. 36, note the 
coder shown is used to encode the video portions), wherein 

the assigning a predefined encoder model to each of the at least two video 
portions according to a characteristic of each of the at least two video portions (col.42, 
ln.47-61; note each video object has an arbitrary shape, and that each video object is
predefined according to its shape, thus, each video object or video portion is assigned a 
predefined encoder model by a mask of alpha values or a binary mask; in fig.27A, note 
there are at least two video portions, elements 972, 974, 976, 978, 980 and 982, where 
there are triangular portions that consist of each of elements 972, 974, 976, 978, 980 
and 982 to form a model of a person 970; fig. 35, note frame 1538 consists of multiple 
portions 1540, 1542, 1544a and 1544b) further comprises: 

comparing first descriptors associated with the at least two video portions and 
second descriptors associated with the subsegments and the regions of interest with 
corresponding stored model descriptors from a plurality of predefined content models 
(col. 51, ln.4-59; note there are plural flags that can aid the determination of the video
portions of the video content; col. 50, ln.18-41, Lee discloses the comparison of the
frames, in particular, the comparison is done with the shape of the first frame that 
contains its respective video portions and the shape of the second frame that contains 
its respective video portions), and 

assigning each of the at least two video content portions to a respective best 
content model based on the comparing of the first and the second descriptors (col. 50, 
ln.27-37, the error computed from the inter-frame shape coding is then applied to assign
the best content model based on the interframe comparison of the shapes between the 
first and second frame data). 

Lee does not specifically disclose wherein each of the at least two video portions 
comprise a temporal, multiframe segment of the video content. However, Linzer 
teaches that each of the at least two video portions comprise a temporal, multiframe 
segment of the video content (fig. 3, Linzer discloses that elements 32-1 to 32-n are the 
plural MPEG encoders, as disclosed in col.6, ln.56-58, in that each of the encoders 32-1
to 32-n compress a temporal, multiframe segment of the available video content known 
in MPEG as a group of frames (GOPs) or a group of frames organized in a temporal, 
multiframe or grouped frames that can be partitioned into multiple frame segments, and 
that the multiple frame segments are compressed by encoders 32-1 to 32-n). 
Therefore, it would have been obvious to one of ordinary skill in the art to combine the 
teachings of Linzer into the system of Lee for permitting accurately, efficiently encoding 
multiple video streaming image data while maintaining high image quality (Linzer col.4,
ln.39-42). 
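
For illustration only, the notion of a temporal, multiframe segment handled by one of several encoders can be sketched as follows. This is a minimal sketch and does not reproduce Linzer's disclosed apparatus: the fixed segment length, the round-robin distribution, and the function names `split_into_segments` and `distribute` are hypothetical assumptions.

```python
def split_into_segments(frame_count, segment_length):
    """Partition frame indices into consecutive multiframe segments.

    Returns a list of (start, end) pairs with end exclusive. A fixed
    segment length is an assumption made for this sketch.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    return [
        (start, min(start + segment_length, frame_count))
        for start in range(0, frame_count, segment_length)
    ]


def distribute(segments, encoder_count):
    """Hand segments to encoders 0..encoder_count-1 in round-robin order.

    Round-robin is an assumed policy, not one stated in either reference.
    """
    assignment = {index: [] for index in range(encoder_count)}
    for position, segment in enumerate(segments):
        assignment[position % encoder_count].append(segment)
    return assignment


# 50 frames cut into segments of up to 15 frames, shared by 2 encoders.
segments = split_into_segments(50, 15)
assignment = distribute(segments, 2)
```

Each encoder in the sketch receives whole multiframe segments rather than portions of a single frame, which is the distinction the rejection draws between Lee and Linzer.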

Regarding claim 10, Lee discloses producing the first descriptors associated with 
the at least two video portions of the video content (col. 51, ln.4-59; note there are plural 
flags that can aid the determination of the video portions of the video content; col. 50, 
ln.18-41, Lee discloses the comparison of the frames, in particular, the comparison is
done with the shape of the first frame that contains its respective video portions and the 
shape of the second frame that contains its respective video portions); producing the 
second descriptors associated with the video subsegments and the regions of interest 
(col. 51, ln.4-59; note there are plural flags that can aid the determination of the video 
portions of the video content; col. 50, ln.18-41, Lee discloses the comparison of the
frames, in particular, the comparison is done with the shape of the first frame that 
contains its respective video portions and the shape of the second frame that contains 
its respective video portions). 

Regarding claim 11, Lee discloses encoding the first and second descriptors
(fig. 33 and col.43, ln.10-15; note the multiplexer 1510 is used to multiplex and encode
video portions from plural video object encoders 1504-1508; and fig. 36, note the coder 
shown is used to encode the video portions, that includes coding the first and second 
descriptors). 

Regarding claim 12, Lee discloses wherein the first and second descriptors are 
used to determine whether the generic encoder or an encoder from a plurality of 
encoders was used to encode the at least two video portions (fig. 33 and col.43,
ln.10-15; note the multiplexer 1510 is used to multiplex and encode video portions from plural
video object encoders 1504-1508; and fig. 36, note the coder shown is used to encode 
the video portions, that includes coding the first and second descriptors). 

Regarding claim 13, Lee discloses a method of encoding video content, the 
method comprising: 

if a video portion of at least two video portions of the video content relates to one 
of a plurality of predefined encoder models (fig. 33, element 1502, col.42, ln.34-46, and
fig. 35, note video object information is extracted and segmented from the input video 
sequence, and segments and subsegments of the regions of interest are identified, and 
in fig. 35 discloses extracting multiple video objects 1540, 1542 and 1544b; fig.27A, note 
there are at least two video portions, elements 972, 974, 976, 978, 980 and 982; fig. 35, 
note frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b), 

assigning the video content portion to a related, predefined encoder model 
chosen from the plurality of predefined encoder models (col.42, ln.47-61; note each 
video object has an arbitrary shape, and that each video object is predefined according 
to its shape, thus, each video object or video portion is assigned a predefined encoder 
model by a mask of alpha values or a binary mask; in fig.27A, note there are at least 
two video portions, elements 972, 974, 976, 978, 980 and 982, where there are 
triangular portions that consist of each of elements 972, 974, 976, 978, 980 and 982 to 
form a model of a person 970; fig. 35, note frame 1538 consists of multiple portions 
1540, 1542, 1544a and 1544b); 

if a video content portion of the at least two video content portions of the video
content does not relate to one of the plurality of predefined encoder models, assigning
the video content portion to a generic encoder model (fig. 33 and col.42, ln.62-65, Lee
discloses the object coders 1504-1508 are used to encode the video portions 
associated with the generic model, in fig. 36, the coder shown is used to encode the 
video portions in a generic manner or model); 

encoding each of the at least two video content portions associated with the 
generic encoder model using a generic encoder (fig. 33 and col.42, ln.62-65; note object 
coders 1504-1508 encode video portions associated with the generic model; and fig. 36, 
note the coder shown is used to encode the video portions in a generic manner or 
model); and 

encoding each of the at least two video content portions associated with one of 
the predefined encoder models with an encoder from a plurality of encoders (fig. 33 and 
col.43, ln.10-15; note the multiplexer 1510 is used to multiplex and encode video
portions from plural video object encoders 1504-1508; and fig. 36, note the coder shown 
is used to encode the video portions), 

wherein the assigning the video content portion to a related, predefined encoder 
model chosen from the plurality of predefined encoder models (col.42, ln.47-61; note 
each video object has an arbitrary shape, and that each video object is predefined 
according to its shape, thus, each video object or video portion is assigned a predefined 
encoder model by a mask of alpha values or a binary mask; in fig.27A, note there are at 
least two video portions, elements 972, 974, 976, 978, 980 and 982, where there are 
triangular portions that consist of each of elements 972, 974, 976, 978, 980 and 982 to
form a model of a person 970; fig. 35, note frame 1538 consists of multiple portions 
1540, 1542, 1544a and 1544b) further comprises: 

comparing descriptors associated with the video content portion with 
corresponding stored model descriptors from a plurality of predefined encoder models 
(col. 51, ln.4-59; note there are plural flags that can aid the determination of the video
portions of the video content; col. 50, ln.18-41, Lee discloses the comparison of the
frames, in particular, the comparison is done with the shape of the first frame that 
contains its respective video portions and the shape of the second frame that contains 
its respective video portions), and 

assigning the video content portion to a best encoder model from the plurality of 
predefined encoder models based on the comparing of the descriptors (col. 50, ln.27-37,
the error computed from the inter-frame shape coding is then applied to assign the best 
content model based on the interframe comparison of the shapes between the first and 
second frame data). 

Lee does not specifically disclose wherein each of the at least two video portions 
comprise a temporal, multiframe segment of the video content. However, Linzer 
teaches that each of the at least two video portions comprise a temporal, multiframe 
segment of the video content (fig. 3, Linzer discloses that elements 32-1 to 32-n are the 
plural MPEG encoders, as disclosed in col. 6, ln.56-58, in that each of the encoders 32-1
to 32-n compress a temporal, multiframe segment of the available video content known 
in MPEG as a group of frames (GOPs) or a group of frames organized in a temporal, 
multiframe or grouped frames that can be partitioned into multiple frame segments, and
that the multiple frame segments are compressed by encoders 32-1 to 32-n). 
Therefore, it would have been obvious to one of ordinary skill in the art to combine the 
teachings of Linzer into the system of Lee for permitting accurately, efficiently encoding 
multiple video streaming image data while maintaining high image quality (Linzer col.4, 
ln.39-42). 

Regarding claim 14, Lee discloses wherein each encoder from a plurality of 
encoders is associated with one of the predefined encoder models of the plurality of 
predefined encoder models (fig. 33 and col.43, ln.10-15; note the multiplexer 1510 is
used to multiplex and encode video portions from plural video object encoders 1504- 
1508; and fig. 36, note the coder shown is used to encode the video portions). 

Regarding claim 15, Lee discloses a method of encoding video content divided 
into at least two portions (fig. 33, element 1502, col.42, ln.34-46, and fig. 35, note video
object information is extracted and segmented from the input video sequence, and
segments and subsegments of the regions of interest are identified, and in fig. 35
discloses extracting multiple video objects 1540, 1542 and 1544b; fig.27A, note there 
are at least two video portions, elements 972, 974, 976, 978, 980 and 982; fig. 35, note 
frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b), each of the at 
least two portions being associated with either a generic encoder model or an encoder 
model chosen from a plurality of predefined encoder models (fig. 33 and col.42, ln.62-65;
note object coders 1504-1508 encode video portions associated with the generic model; 
and fig. 36, note the coder shown is used to encode the video portions), the method
comprising: 

comparing descriptors associated with the at least two portions with 
corresponding stored model descriptors from a plurality of predefined encoder models 
(col. 51, ln.4-59; note there are plural flags that can aid the determination of the video
portions of the video content; col. 50, ln.18-41, Lee discloses the comparison of the
frames, in particular, the comparison is done with the shape of the first frame that 
contains its respective video portions and the shape of the second frame that contains 
its respective video portions); 

assigning each of the at least two portions to a respective best encoder model 
from the plurality of predefined encoder models based on the comparing of the 
descriptors (col. 50, ln.27-37, the error computed from the inter-frame shape coding is
then applied to assign the best content model based on the interframe comparison of 
the shapes between the first and second frame data); 

routing each of the at least two portions that is not assigned a respective best 
encoder model from the plurality of encoder models to a generic encoder (fig. 33 and 
col.42, ln.62-65, Lee discloses the object coders 1504-1508 are used to encode the
video portions associated with the generic model, in fig. 36, the coder shown is used to 
encode the video portions in a generic manner or model); and 

routing each of the at least two portions assigned to the respective best encoder 
model of the plurality of predefined encoder models to an encoder associated with the 
respective best encoder model (col. 50, ln.27-37, the error computed from the inter-frame
shape coding is then applied to assign the best content model based on the
interframe comparison of the shapes between the first and second frame data). 

Lee does not specifically disclose wherein each of the at least two video portions 
comprise a temporal, multiframe segment of the video content. However, Linzer 
teaches that each of the at least two video portions comprise a temporal, multiframe 
segment of the video content (fig. 3, Linzer discloses that elements 32-1 to 32-n are the 
plural MPEG encoders, as disclosed in col.6, ln.56-58, in that each of the encoders 32-1
to 32-n compress a temporal, multiframe segment of the available video content known 
in MPEG as a group of frames (GOPs) or a group of frames organized in a temporal, 
multiframe or grouped frames that can be partitioned into multiple frame segments, and 
that the multiple frame segments are compressed by encoders 32-1 to 32-n). 
Therefore, it would have been obvious to one of ordinary skill in the art to combine the 
teachings of Linzer into the system of Lee for permitting accurately, efficiently encoding 
multiple video streaming image data while maintaining high image quality (Linzer col.4, 
ln.39-42). 
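
For illustration only, the routing recited in claim 15, in which a portion that is not assigned a best encoder model goes to a generic encoder, can be sketched as follows. This is a minimal sketch and is not taken from Lee, Linzer, or the application: the distance threshold `max_distance` used to decide that no model matches, the Euclidean distance measure, and the function name `choose_encoder` are hypothetical assumptions.

```python
from math import dist


def choose_encoder(descriptor, models, max_distance):
    """Pick the nearest stored model, or "generic" when none is close enough.

    `models` maps a model name to its stored descriptor vector. The
    threshold test is an assumed way of deciding that a portion is
    "not assigned" a best encoder model.
    """
    best_name, best_distance = None, float("inf")
    for name, stored in models.items():
        distance = dist(descriptor, stored)
        if distance < best_distance:
            best_name, best_distance = name, distance
    if best_name is None or best_distance > max_distance:
        return "generic"
    return best_name


# Hypothetical two-dimensional descriptors for two predefined models.
models = {"news": (1.0, 0.0), "sports": (0.0, 1.0)}

close_to_news = choose_encoder((0.9, 0.1), models, max_distance=0.5)
ambiguous = choose_encoder((0.5, 0.5), models, max_distance=0.5)
```

The second call falls back to the generic encoder because the portion lies equally far from both stored descriptors and beyond the assumed threshold.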

Regarding claim 16, Lee discloses wherein each encoder from a plurality of 
encoders is optimized for each predefined encoder model of the plurality of encoder 
models (fig. 33 and col.43, ln.10-15; note the multiplexer 1510 is used to multiplex and
encode video portions from plural video object encoders 1504-1508; and fig. 36, note the 
coder shown is used to encode the video portions, thus optimizing the encoders for 
each predefined model of plural encoder models). 

Regarding claim 18, Lee discloses a method of producing a bitstream coded
according to video content, the method comprising: 

associating each of at least two portions of the video content to either a generic 
encoder model or a predefined encoder model chosen from a plurality of predefined 
encoder models (col.42, ln.47-61; note each video object has an arbitrary shape, and 
that each video object is predefined according to its shape, thus, each video object or 
video portion is assigned a predefined encoder model by a mask of alpha values or a 
binary mask; in fig.27A, note there are at least two video portions, elements 972, 974, 
976, 978, 980 and 982, where there are triangular portions that consist of each of 
elements 972, 974, 976, 978, 980 and 982 to form a model of a person 970; fig. 35, note 
frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b); 

routing each of the at least two portions associated with the generic encoder 
model to a generic encoder (fig. 33 and col.42, ln.62-65; note object coders 1504-1508
encode video portions associated with the generic model; and fig. 36, note the coder 
shown is used to encode the video portions in a generic manner or model); and 

routing each of the at least two portions associated with an encoder model of the 
plurality of predefined encoder models to one of a plurality of encoders, wherein each 
encoder of the plurality of encoders is associated with one of the predefined encoder 
models (fig. 33 and col.43, ln.10-15; note the multiplexer 1510 is used to multiplex and
encode video portions from plural video object encoders 1504-1508; and fig. 36, note the 
coder shown is used to encode the video portions), 

wherein the associating each of the at least two portions of the video content to
either a generic encoder model or a predefined encoder model chosen from a plurality 
of predefined encoder models (col.42, ln.47-61; note each video object has an arbitrary 
shape, and that each video object is predefined according to its shape, thus, each video 
object or video portion is assigned a predefined encoder model by a mask of alpha 
values or a binary mask; in fig.27A, note there are at least two video portions, elements 
972, 974, 976, 978, 980 and 982, where there are triangular portions that consist of 
each of elements 972, 974, 976, 978, 980 and 982 to form a model of a person 970; 
fig. 35, note frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b) 
further comprises: 

comparing descriptors associated with each of the at least two portions with 
corresponding stored model descriptors from the plurality of predefined encoder models 
(col. 51, ln.4-59; note there are plural flags that can aid the determination of the video 
portions of the video content; col. 50, ln.18-41, Lee discloses the comparison of the
frames, in particular, the comparison is done with the shape of the first frame that 
contains its respective video portions and the shape of the second frame that contains 
its respective video portions), and 

associating each of the at least two portions with a respective best encoder 
model from the plurality of predefined encoder models or the generic encoder model 
based on the comparing of the descriptors (col. 50, ln.27-37, the error computed from
the inter-frame shape coding is then applied to assign the best content model based on 
the interframe comparison of the shapes between the first and second frame data). 

Lee does not specifically disclose wherein each of the at least two video portions
comprise a temporal, multiframe segment of the video content. However, Linzer 
teaches that each of the at least two video portions comprise a temporal, multiframe 
segment of the video content (fig. 3, Linzer discloses that elements 32-1 to 32-n are the 
plural MPEG encoders, as disclosed in col. 6, ln.56-58, in that each of the encoders 32-1
to 32-n compress a temporal, multiframe segment of the available video content known 
in MPEG as a group of frames (GOPs) or a group of frames organized in a temporal, 
multiframe or grouped frames that can be partitioned into multiple frame segments, and 
that the multiple frame segments are compressed by encoders 32-1 to 32-n). 
Therefore, it would have been obvious to one of ordinary skill in the art to combine the 
teachings of Linzer into the system of Lee for permitting accurately, efficiently encoding 
multiple video streaming image data while maintaining high image quality (Linzer col.4, 
ln.39-42). 

Regarding claim 19, Lee discloses multiplexing each portion and transmitting 
each portion in a bitstream (fig. 33 and col.43, ln.10-15; note the multiplexer 1510 is
used to multiplex and encode video portions from plural video object encoders 1504- 
1508; and fig. 36, note the coder shown is used to encode the video portions). 

Regarding claim 20, Lee discloses locating subsegments and regions of interest 
in the extracted portions (fig. 33, element 1502, col.42, ln.34-46, and fig. 35, note video
object information is extracted and segmented from the input video sequence, and 
segments and subsegments of the regions of interest are identified, and in fig. 35 
discloses extracting multiple video objects 1540, 1542 and 1544b; fig.27A, note there 
are at least two video portions, elements 972, 974, 976, 978, 980 and 982; fig. 35, note 
frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b). 

Regarding claim 21, Lee discloses a method of encoding a bitstream using a 
plurality of encoders, the method comprising: 

mapping each of at least two segments extracted from video content to a 
predefined encoder model (col.42, ln.47-61; note each video object has an arbitrary 
shape, and that each video object is predefined according to its shape, thus, each video 
object or video portion is assigned a predefined encoder model by a mask of alpha 
values or a binary mask; in fig.27A, note there are at least two video portions, elements 
972, 974, 976, 978, 980 and 982, where there are triangular portions that consist of 
each of elements 972, 974, 976, 978, 980 and 982 to form a model of a person 970; 
fig. 35, note frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b); 
and 

routing the at least two extracted and mapped segments to one of the plurality of 
encoders based on the mapping to the respective predefined encoder model (col.42, 
ln.47-61, Lee discloses that each video object has an arbitrary shape, and that each 
video object is predefined according to its shape, thus, each video object or video 
portion is routed or assigned a predefined encoder model by a mask of alpha values or 
a binary mask), 

wherein the mapping each of at least two segments extracted from the video 
content to a predefined encoder model (col.42, ln.47-61; note each video object has an 
arbitrary shape, and that each video object is predefined according to its shape, thus, 
each video object or video portion is assigned a predefined encoder model by a mask of 
alpha values or a binary mask; in fig.27A, note there are at least two video portions, 
elements 972, 974, 976, 978, 980 and 982, where there are triangular portions that 
consist of each of elements 972, 974, 976, 978, 980 and 982 to form a model of a 
person 970; fig. 35, note frame 1538 consists of multiple portions 1540, 1542, 1544a and 
1544b) further comprises: 

comparing descriptors associated with each of the at least two extracted 
segments with corresponding stored model descriptors from the plurality of predefined 
encoder models (col. 51, ln.4-59; note there are plural flags that can aid the 
determination of the video portions of the video content; col. 50, ln. 18-41, Lee discloses 
the comparison of the frames, in particular, the comparison is done with the shape of 
the first frame that contains its respective video portions and the shape of the second 
frame that contains its respective video portions), and 

mapping each of the at least two extracted segments to a respective best 
encoder model from the plurality of predefined encoder models based on the comparing 
(col. 50, ln. 27-37, the error computed from the inter-frame shape coding is then applied 
to assign the best content model based on the interframe comparison of the shapes 
between the first and second frame data). 

Lee does not specifically disclose wherein each of the at least two video portions 
comprises a temporal, multiframe segment of the video content. However, Linzer 
teaches that each of the at least two video portions comprises a temporal, multiframe 
segment of the video content (fig. 3; Linzer discloses that elements 32-1 to 32-n are 
plural MPEG encoders, col. 6, ln. 56-58, and that each of the encoders 32-1 to 32-n 
compresses a temporal, multiframe segment of the available video content, known in 
MPEG as a group of frames (GOP), i.e., frames grouped temporally that can be 
partitioned into multiple frame segments, the multiple frame segments being 
compressed by encoders 32-1 to 32-n). 
Therefore, it would have been obvious to one of ordinary skill in the art to incorporate 
the teachings of Linzer into the system of Lee in order to permit accurate, efficient 
encoding of multiple streams of video image data while maintaining high image quality 
(Linzer col.4, ln.39-42). 

Regarding claim 22, Lee discloses locating subsegments and regions of interest 
in the extracted segments (fig. 33, element 1502, col.42, ln. 34-46, and fig. 35, note video 
object information is extracted and segmented from the input video sequence, and 
segments and subsegments of the regions of interest are identified, and in fig. 35 
discloses extracting multiple video objects 1540, 1542 and 1544b; fig.27A, note there 
are at least two video portions, elements 972, 974, 976, 978, 980 and 982; fig. 35, note 
frame 1538 consists of multiple portions 1540, 1542, 1544a and 1544b). 

Regarding claims 27-29, Lee discloses a coded bitstream having portions of the 
bitstream encoded using different encoders according to encoder models associated 
with a subject matter of each portion of the bitstream, the coded bitstream encoded 
according to the method of claims 1, 18 and 21, respectively (fig. 33 and col.42, ln.62- 
65; note different video object coders 1504-1508 encode video portions associated with 
the generic model; col.43, ln.10-15; note the multiplexer 1510 is used to multiplex and 
encode video portions from plural different video object encoders 1504-1508). 

Regarding claim 30, Lee discloses wherein the assigning a predefined model to 
each of at least two video content portions of the video content further comprises 
assigning a different predefined model to each of the at least 
two video content portions of the video content (col.42, ln.47-61; note each video object 
has an arbitrary shape, and that each video object is predefined according to its shape, 
thus, each video object or video portion is assigned a predefined encoder model by a 
mask of alpha values or a binary mask; in fig.27A, note there are at least two video 
portions, elements 972, 974, 976, 978, 980 and 982, where there are triangular portions 
that consist of each of elements 972, 974, 976, 978, 980 and 982 to form a model of a 
person 970; fig. 35, note frame 1538 consists of multiple portions 1540, 1542, 1544a and 
1544b; fig. 33 and col.42, ln.62-65; note different video object coders 1504-1508 encode 
video portions associated with the generic model; col.43, ln.10-15; note the multiplexer 
1510 is used to multiplex and encode video portions from plural different video object 
encoders 1504-1508). 

Regarding claim 31, Lee discloses wherein the assigning a predefined encoder 
model to each of at least two video portions according to a characteristic of each of the 
at least two video portions further comprises assigning a different predefined encoder model to 
each of the at least two video portions of the video content (col.42, ln.47-61; note each 
video object has an arbitrary shape, and that each video object is predefined according 
to its shape, thus, each video object or video portion is assigned a predefined encoder 
model by a mask of alpha values or a binary mask; in fig.27A, note there are at least 
two video portions, elements 972, 974, 976, 978, 980 and 982, where there are 
triangular portions that consist of each of elements 972, 974, 976, 978, 980 and 982 to 
form a model of a person 970; fig. 35, note frame 1538 consists of multiple portions 
1540, 1542, 1544a and 1544b; fig. 33 and col.42, ln.62-65; note different video object 
coders 1504-1508 encode video portions associated with the generic model; col.43, 
ln.10-15; note the multiplexer 1510 is used to multiplex and encode video portions from 
plural different video object encoders 1504-1508). 

Regarding claim 32, Lee discloses wherein the assigning the video content 
portion to a related, predefined encoder model chosen from the plurality of predefined 
encoder models further comprises assigning each of the at least two video content 
portions of the video content to a different one of the predefined encoder models 
(col.42, ln.47-61; note each video object has an arbitrary shape, and that each video 
object is predefined according to its shape, thus, each video object or video portion is 
assigned a predefined encoder model by a mask of alpha values or a binary mask; in 
fig.27A, note there are at least two video portions, elements 972, 974, 976, 978, 980 
and 982, where there are triangular portions that consist of each of elements 972, 974, 
976, 978, 980 and 982 to form a model of a person 970; fig. 35, note frame 1538 
consists of multiple portions 1540, 1542, 1544a and 1544b; fig. 33 and col.42, ln.62-65; 
note different video object coders 1504-1508 encode video portions associated with the 
generic model; col.43, ln.10-15; note the multiplexer 1510 is used to multiplex and 
encode video portions from plural different video object encoders 1504-1508). 



Regarding claim 33, Lee discloses wherein assigning each of the at least two 
portions to a respective best encoder model from the plurality of predefined encoder 
models based on the comparing of the descriptors further comprises assigning each of 
the at least two portions to a different one of the plurality of predefined encoder models 
(col. 50, ln.27-37, the error computed from the inter-frame shape coding is then applied 
to assign the best content model based on the interframe comparison of the shapes 
between the first and second frame data). 

Regarding claim 34, Lee discloses the associating each of the at least two 
portions of the video content to either a generic encoder model or a predefined encoder 
model further comprises associating each of the at least two portions of the video 
content to a different encoder model chosen from the generic encoder model or the 
plurality of predefined encoder models (col.42, ln.47-61; note each video object has an 
arbitrary shape, and that each video object is predefined according to its shape, thus, 
each video object or video portion is assigned a predefined encoder model by a mask of 
alpha values or a binary mask; in fig.27A, note there are at least two video portions, 
elements 972, 974, 976, 978, 980 and 982, where there are triangular portions that 
consist of each of elements 972, 974, 976, 978, 980 and 982 to form a model of a 
person 970; fig. 35, note frame 1538 consists of multiple portions 1540, 1542, 1544a and 
1544b; fig. 33 and col.42, ln.62-65; note different video object coders 1504-1508 encode 
video portions associated with the generic model; col.43, ln.10-15; note the multiplexer 
1510 is used to multiplex and encode video portions from plural different video object 
encoders 1504-1508). 

Regarding claim 35, Lee discloses wherein the mapping each of at least two 
segments extracted from the video content to a predefined encoder model further 
comprises mapping each of the at least two segments to a different predefined encoder 
model (col.42, ln.47-61, Lee discloses that each video object has an arbitrary shape, 
and that each video object is predefined according to its shape, thus, each video object 
or video portion is routed or assigned a predefined encoder model by a mask of alpha 
values or a binary mask; fig. 35, note frame 1538 consists of multiple portions 1540, 
1542, 1544a and 1544b; fig. 33 and col.42, ln.62-65; note different video object coders 
1504-1508 encode video portions associated with the generic model; col.43, ln.10-15; 
note the multiplexer 1510 is used to multiplex and encode video portions from plural 
different video object encoders 1504-1508). 